An Agent-based Architecture for a Multimodal Interface. In Working Notes of the AAAI-94 Spring Symposium on Intelligent
Author
Abstract
mouse, and speech for input, and audio and screen-based feedback for output. Multimedia presentation. The system makes use of a number of different media for human-computer interaction; in particular text, graphics, animation, and prerecorded video (the latter acting as a "cheap", but nevertheless very effective, substitute for in-office video cameras and/or active badges). 3) Communications-oriented capabilities. The system enables transparent communication across different computer platforms (Macintosh, Unix) and facilitates the interconnection of system users via telephone, email, and voice messaging.

A number of features of the CALVIN architecture have proven useful for developing multimodal applications that integrate a number of distributed media resources. In particular, rapid responses to users' commands are facilitated through integration of appropriate reactive behaviors in the system's Interface and User agents (Ferguson and Davlouros 1995); in addition, blending of complementary input modalities is facilitated through the execution of multiple concurrent agents (which in turn are able to execute multiple concurrent, task-specific behaviors).
Current work already underway includes porting the graphical user interface portion of the PeopleFinder to both PC and Unix platforms (in the interest of extending the tool's audience and ensuring a more thorough testing and empirical evaluation phase of the project); integrating a number of other software applications such as teleconferencing, voice dictation, and video camera-based face recognition; extending agents' capabilities for autonomously resolving run-time conflicts resulting from shared access to the different presentation and communications resources used by the system (see Werkman's KBN negotiation-based conflict resolution work for related issues (Werkman 1994)); and formalizing the various rules used by the PeopleFinder to combine multiple media with multiple modalities for both human-computer interaction and computer-supported human-human communication, much along the lines of the work of Arens et al. (Arens et al. 1993) on allocating multiple media. The tool is implemented using a variety of different scripting languages (AppleScript, QuicKeys, and C-shell) and runs on a Macintosh Quadra 840 AV. The tool also makes use of the Macintosh's Apple Phone tool and GeoPort Telecom Adapter for performing its various computer-telephony integration tasks.
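The abstract describes blending complementary input modalities through multiple concurrent agents but gives no implementation details. The following is only a minimal sketch of that general idea in Python, not the PeopleFinder or CALVIN code (which the abstract says is written in AppleScript, QuicKeys, and C-shell). All names here (`InputEvent`, `modality_agent`, `fuse`, the one-second pairing window, and the sample payloads) are hypothetical and chosen purely for illustration.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class InputEvent:
    modality: str     # "speech" or "mouse" (hypothetical labels)
    payload: str      # recognized utterance or selected screen object
    timestamp: float  # seconds; supplied by the caller so runs are reproducible


async def modality_agent(events, queue):
    """Hypothetical input agent: forwards its events to a shared queue."""
    for event in events:
        await queue.put(event)
        await asyncio.sleep(0)  # yield so the other agent can interleave


def fuse(events, window=1.0):
    """Pair each spoken command with the closest mouse selection in time.

    The `window` threshold (in seconds) is an assumption of this sketch.
    """
    speech = [e for e in events if e.modality == "speech"]
    mouse = [e for e in events if e.modality == "mouse"]
    commands = []
    for s in speech:
        near = [m for m in mouse if abs(m.timestamp - s.timestamp) <= window]
        if near:
            target = min(near, key=lambda m: abs(m.timestamp - s.timestamp))
            commands.append((s.payload, target.payload))
    return commands


async def run_agents(speech_events, mouse_events):
    """Run one agent per modality concurrently, then fuse what they collected."""
    queue = asyncio.Queue()
    await asyncio.gather(
        modality_agent(speech_events, queue),
        modality_agent(mouse_events, queue),
    )
    collected = []
    while not queue.empty():
        collected.append(queue.get_nowait())
    return fuse(collected)


def demo():
    speech = [InputEvent("speech", "call this person", 10.2)]
    mouse = [
        InputEvent("mouse", "staff-icon-7", 10.0),
        InputEvent("mouse", "staff-icon-3", 14.5),
    ]
    return asyncio.run(run_agents(speech, mouse))
```

In this sketch, `demo()` pairs the spoken command with `staff-icon-7`, the selection made 0.2 seconds earlier, and ignores the later click that falls outside the window. A real system would also need the conflict-resolution and media-allocation rules the abstract lists as ongoing work.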
Similar resources
[FJ94] Claudie Faure and Luc Julia. An Agent-based Architecture for a Multimodal Interface. In Working Notes of the AAAI-94 Spring Symposium on Intelligent Multi-media Multi-modal Systems, 5. Summary
The PeopleFinder is a knowledge-based tool to assist users in determining the whereabouts of other staff located in an office or network environment. The tool makes use of several modes of input and output, as well as employing a number of interface and communications media with which to present information and interconnect remote system users. An accompanying video contains example uses of the...
DALI: An Architecture for Intelligent Logical Agents
Many interesting architectures for defining intelligent agents have been proposed in the last years. Logic-based architectures have proved effective for reproducing “intelligent” behavior while staying within a rigorous formal setting. In this paper, we present the DALI multi-agent architecture, a logic framework for defining intelligent agents and multi-
A Context-aware Architecture for Mental Model Sharing through Semantic Movement in Intelligent Agents
Recent studies in multi-agent systems are paying increasingly more attention to the paradigm of designing intelligent agents with human inspired concepts. One of the main cognitive concepts driving the core of many recent approaches in multi agent systems is shared mental models. In this paper, we propose an architecture for sharing mental models based on a new concept called semantic movement....
Modeling Human-Agent Interaction with Active Ontologies
As computer systems continue to grow in power and access more networked content and services, we believe there will be an increasing need to provide more user-centric systems that act as intelligent assistants, able to interact naturally with human users and with the information environment. Building such systems is a difficult task that requires expertise in many AI fields, ranging from reason...
Integration of Visuomotor Learning, Cognitive Grasping and Sensor-Based Physical Interaction in the UJI Humanoid Torso
We present a high-level overview of our research efforts to build an intelligent robot capable of addressing real-world problems. The UJI Humanoid Robot Torso integrates research accomplishments under the common framework of multimodal active perception and exploration for physical interaction and manipulation. Its main components are three subsystems for visuomotor learning, object grasping an...